keywords:"Nevyvážená data" (unbalanced data) - Search results - Digital Repository


	Classification on unbalanced data
	Hlosta, Martin ; Popelínský, Lubomír (reviewer) ; Štěpánková, Olga (reviewer) ; Zendulka, Jaroslav (supervisor)
	This thesis focuses on classification on unbalanced data, an important part of machine learning that addresses the issues arising when one class is significantly underrepresented compared to the other. The minority class is usually the more important one, and traditional algorithms that favour the majority class may ignore it. Two application domains motivated the research and the identification of two specific problems of imbalanced data. First, the presence of a constraint on the performance of the minority class in the computer security domain resulted in the formulation of the constrained classification problem. I proposed a solution that combines cost-sensitive logistic regression with stochastic algorithms, which in the conducted experiments always improved the performance of the logistic regression. Second, the domain of Learning Analytics motivated me to define a general prediction problem: whether a goal has been achieved within the deadline. I designed the Self-Learning framework, in which models are trained by analysing attributes of objects that achieved the goal early in the investigated period. Because only a few objects satisfy the goal at the beginning, the problem is by its nature imbalanced, with the imbalance decreasing over time. The evaluation, performed on the task of identifying at-risk students in distance higher education, showed (1) the predictive power compared to the specified baseline models and (2) that methods for tackling the class imbalance without domain information did not lead to significant improvements. When domain information is utilised in the extended version of Self-Learning, the evaluation showed a performance increase. Understanding and exploiting the source of imbalance can also lead to better results.
	Image segmentation of unbalanced data using artificial intelligence (Segmentace obrazu nevyvážených dat pomocí umělé inteligence)
	Polách, Michal ; Rajnoha, Martin (reviewer) ; Kolařík, Martin (supervisor)
	This thesis focuses on the segmentation of unbalanced data using artificial intelligence. It surveys known methods for dealing with unbalanced data, selects suitable ones, and applies them to a real problem in which the goal is to segment unbalanced data with a class ratio greater than 6000:1.
	Machine Learning Methods in Payment Card Fraud Detection
	Sinčák, Jan ; Baruník, Jozef (supervisor) ; Vácha, Lukáš (reviewer)
	Protecting clients from fraudulent transactions is a demanding task. Banks usually rely on rule-based systems, which require rules for identifying fraud to be created manually. These rules must be set by bank employees, who have to search for trends in fraudulent transactions themselves. This thesis addresses the problem of detecting fraudulent card transactions and compares several machine learning models for fraud detection. These models can find complex relationships in the data and potentially outperform classical fraud detection systems. Logistic regression, a neural network, random forest, and extreme gradient boosting (XGBoost) are trained on a simulated dataset that closely mimics the properties of real card transactions. Model performance is measured by sensitivity, specificity, precision, AUC, and prediction time on a test dataset. XGBoost shows the highest performance among the tested models. It is then compared with a standard fraud detection system used in a Czech bank. The bank's system achieves higher specificity, but XGBoost nevertheless shows promising results. It is possible that some machine learning models could outperform current fraud detection systems if they are well tuned. JEL classification: G21, K42. Keywords: machine learning, card fraud,...
